Abstract

The key to understanding a protein's function often lies in its conformational dynamics. We
develop a coarse-grained variational model to investigate the interplay between structural transitions,
conformational flexibility and function of the N-terminal calmodulin (nCaM) domain. In this model,
two energy basins corresponding to the "closed" apo conformation and "open" holo conformation of
the nCaM domain are connected by a uniform interpolation parameter. The resulting detailed
transition route from our model is largely consistent with the recently proposed EFβ-scaffold mechanism
in EF-hand family proteins. We find that the N-terminal parts of calcium binding loops I and
II show higher flexibility than the C-terminal parts, which form this EFβ-scaffold structure. The
structural transitions of binding loops I and II are compared in detail. Our model predicts that
binding loop II, with higher flexibility and earlier structural change than binding loop I, dominates
the conformational transition in the nCaM domain.
INTRODUCTION



Many protein functions fundamentally depend on structural flexibility. Complex confor- 
mational transitions, induced by ligand binding for example, are often essential to proteins 
participating in regulatory networks or enzyme catalysis. More generally, a protein's ability 
to sample a variety of conformational sub-states implies that proteins have an intrinsic flexibility
and mobility that influences their function. While experimental measurement can
offer direct dynamical information about specific residues, uncovering the detailed mecha- 
nisms controlling conformational transitions between two meta-stable states is often elusive. 
In this paper we present an analytic model that aims to clarify the relationship between 
main-chain dynamics and the mechanisms controlling conformational transitions of flexi- 
ble proteins. In particular, we examine the mechanism for the open/closed transition of 
the N-terminal domain of Calmodulin (nCaM) to explore how calcium binding and target 
recognition can be understood by changes in the mobility and the degree of partial order of 
the protein backbone. 

Calmodulin (CaM) may be an ideal model system to illustrate how conformational flexibility
is a major determinant of biological function. CaM is found in all eukaryotic cells and
functions as a multipurpose intracellular Ca²⁺ receptor, mediating many Ca²⁺-regulated
processes. CaM is a small (148 amino acid) dumbbell-shaped protein with two domains
connected by a flexible linker. Each domain of CaM contains a pair of helix-loop-helix
Ca²⁺-binding motifs called EF-hands (helices A/B and C/D in the N-terminal domain). These
two EF-hands are connected by a flexible B/C helix-linker (see Fig. 1). In each domain the
four helices of apo-CaM are directed in a somewhat antiparallel fashion, giving the domains
a relatively compact structure while leaving the Ca²⁺-binding loops exposed. The conformational
change induced by binding Ca²⁺ can be described as a change in EF-hand interhelical
angle (between helices A/B and C/D) from a nearly antiparallel (apo, closed conformation)
to a nearly perpendicular (holo, open conformation) orientation. Furthermore, this domain
opening mechanism in nCaM indicates that binding of Ca²⁺ occurs almost exclusively within
EF-hands, not between them. The structural rearrangement from closed to open exposes a
large hydrophobic surface rich in Methionine residues responsible for molecular recognition
of various cellular targets such as myosin light chain kinase.

The high flexibility of CaM is essential to its function. The flexibility of the central helix
linking the two domains allows the activated domains to simultaneously interact with target
peptides. The conformational flexibility of the domains themselves allows for considerable
binding promiscuity of target peptides, a property essential to its function as a primary
messenger in Ca²⁺ signal transduction. While similar in structure and fold, the two domains of
CaM are quite different in terms of their flexibility, melting temperatures, and Ca²⁺-binding
affinities.

The conformational dynamics of Ca²⁺-loaded and Ca²⁺-free CaM are well characterized
by solution NMR. Site-specific internal dynamics monitored by model-free order parameters,
$S^2$, indicate that the helices of the apo-CaM domains are well folded on the picosecond
to nanosecond timescale, while the Ca²⁺-binding loops, helix-linker and termini are more
flexible. On the other hand, spin-spin relaxation (or transverse auto-relaxation) rates, $R_2$,
indicate that the free and bound forms of the regulatory protein exchange on the millisecond
timescale. Akke and coworkers have investigated the rate of conformational exchange
between the open and closed conformational substates of the C-terminal CaM (cCaM) domain
by NMR ¹⁵N spin relaxation experiments. Comparison of exchange rates as a function of
Ca²⁺ concentration has established that the conformational exchange in apo-cCaM involves
an equilibrium switching between the closed and open states that is independent of Ca²⁺
concentration.

X-ray crystallography temperature factors give additional insight into the conformational
freedom and internal flexibility of CaM in the open and closed states. Recently, Grabarek
proposed a detailed mechanism of Ca²⁺-driven conformational change in EF-hand proteins based
on the analysis of a trapped intermediate X-ray structure of a Ca²⁺-bound CaM mutant.
This two-step Ca²⁺-binding mechanism is based on the hypothesis that Ca²⁺ binding and
the resultant conformational change in all two-EF-hand domains are determined by a segment
of the structure that remains fixed as the domain opens. This segment, called the
EF-hand-β-scaffold, refers to the bond network that connects the two Ca²⁺ ions. It includes the
backbone and the two hydrogen bonds formed by the residues in the 8th position of the binding
loops (Ile27 and Ile63) and the C=O groups of the residues in the 7th position of the binding
loops (Thr26 and Thr62). Indeed, in the absence of Ca²⁺, the N-terminal end of the
binding loop is found to be poorly structured and very dynamic in NMR structures
and X-ray temperature factors. The functional distinction between the two ends of the binding
loops in the domain opening mechanism is buttressed by the great variability of the amino
acid sequences of the N-terminal ends of the Ca²⁺-binding loops compared with the more
conserved C-terminal ends across a variety of different EF-hand Ca²⁺-binding proteins.

In this paper, we study the role of flexibility in the conformational transition of CaM 
through an extension of a coarse-grained variational model developed to characterize protein 
folding. This model accommodates two meta-stable folded conformations as minima of
the calculated free energy surface. The natural order parameters of this model, discussed in
detail in the methods section, are well suited to describe partially ordered ensembles essential
to the conformational dynamics of flexible proteins. Transition routes and conformational 
changes of the protein are determined by constrained minimization of a variational free 
energy surface parameterized by the degree of localization of each residue about its mean 
position. The computational time to calculate the transition route for nCaM is on the order 
of several minutes on a typical single-processor PC. 

In addition to extensive experimental work characterizing the inherent flexibility of CaM,
our results also benefit from all-atom molecular dynamics simulations as well as recent
coarse-grained simulations inspired by models developed to characterize protein folding.
Although subject to systematic errors due to approximations, analytic models have the 
important advantage that the results are free of statistical noise that can obscure simulation 
results (particularly troublesome when characterizing low probability states). 

MODEL AND METHODS 

A configuration of a protein is expressed by the $N$ position vectors $\{\mathbf{r}_i\}$ of the α-carbons
of the polypeptide backbone. We are interested in describing transitions between two
known structures denoted by $\{\mathbf{r}_i^{N_1}\}$ and $\{\mathbf{r}_i^{N_2}\}$. Partially ordered ensembles of polymer
configurations are described by a reference Hamiltonian

$$\frac{H_0}{k_B T} = \frac{3}{2a^2}\sum_{ij} \mathbf{r}_i \cdot \Gamma_{ij} \cdot \mathbf{r}_j + \frac{3}{2a^2}\sum_{i} C_i \left[\mathbf{r}_i - \mathbf{r}_i^{N}(\alpha_i)\right]^2, \qquad (1)$$

where $T$ is the temperature and $k_B$ is Boltzmann's constant. Here, the first term enforces
chain connectivity, in which the connectivity matrix, $\Gamma_{ij}$, corresponds to a freely rotating
chain with mean bond length $a = 3.8$ Å and valence angle between successive bond vectors
set by $\cos\theta = 0.8$. The $N$ variational parameters, $\{C\}$, control the magnitude of the
fluctuations about the α-carbon position vectors $\mathbf{r}_i^{N}(\alpha_i) = \alpha_i \mathbf{r}_i^{N_1} + (1 - \alpha_i)\mathbf{r}_i^{N_2}$. The $N$ variational
parameters, $\{\alpha\}$ ($0 < \alpha_i < 1$), specify residue positions as an interpolation between $\{\mathbf{r}_i^{N_1}\}$
and $\{\mathbf{r}_i^{N_2}\}$.

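The Boltzmann weight for a constrained chain described by $H_0$ is proportional to

$$\omega(\{C\},\{\alpha\}) \propto \exp\left[-\frac{3}{2a^2}\sum_{ij} \delta\mathbf{r}_i \cdot G^{-1}_{ij} \cdot \delta\mathbf{r}_j\right], \qquad (2)$$

where $G_{ij}$ denotes the correlations of monomers $i$ and $j$ relative to the mean locations,
$G_{ij} = \langle \delta\mathbf{r}_i \cdot \delta\mathbf{r}_j \rangle_0 / a^2$ with $\delta\mathbf{r}_i = \mathbf{r}_i - \mathbf{s}_i$. Here, the correlations $G_{ij}$ are given by the matrix
inverse $G^{-1}_{ij} = \Gamma_{ij} + C_i \delta_{ij}$, and the mean positions of each monomer, $\mathbf{s}_i = \sum_j G_{ij} C_j \mathbf{r}_j^{N}(\alpha_j)$,
interpolate between the coordinates in each native structure,

$$\mathbf{s}_i = \sum_j G_{ij} C_j \left[\alpha_j \mathbf{r}_j^{N_1} + (1 - \alpha_j)\mathbf{r}_j^{N_2}\right]. \qquad (3)$$

The statistical properties of a structural ensemble can be described in terms of the first two
moments $\mathbf{s}_i$ and $G_{ij}$ since $H_0$ is harmonic.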
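
As a rough illustration of how the two moments follow from the constraints, the sketch below builds the correlation matrix as the inverse of the connectivity matrix plus the diagonal constraints and then evaluates the mean positions of Eq. (3). It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names are hypothetical, the toy coordinates are random, and a simple nearest-neighbour connectivity matrix stands in for the freely rotating chain matrix, whose explicit form is not given here.

```python
import numpy as np

def chain_connectivity(n):
    """Nearest-neighbour connectivity matrix (assumed stand-in for Gamma)."""
    gamma = np.zeros((n, n))
    for i in range(n - 1):
        gamma[i, i] += 1.0
        gamma[i + 1, i + 1] += 1.0
        gamma[i, i + 1] -= 1.0
        gamma[i + 1, i] -= 1.0
    return gamma

def ensemble_moments(gamma, c, alpha, r1, r2):
    """Return (G, s) with G = (Gamma + diag(C))^-1 and
    s_i = sum_j G_ij C_j [alpha_j r1_j + (1 - alpha_j) r2_j]."""
    g = np.linalg.inv(gamma + np.diag(c))
    r_target = alpha[:, None] * r1 + (1.0 - alpha[:, None]) * r2
    s = g @ (c[:, None] * r_target)
    return g, s

# Toy data: six "residues" with random reference coordinates (hypothetical).
rng = np.random.default_rng(0)
n = 6
r1 = rng.normal(size=(n, 3))   # stands in for structure (1)
r2 = rng.normal(size=(n, 3))   # stands in for structure (2)
gamma = chain_connectivity(n)
c = np.full(n, 1.0e6)          # very strong constraints on every residue
alpha = np.full(n, 0.5)        # halfway between the two structures
g, s = ensemble_moments(gamma, c, alpha, r1, r2)
```

With very large constraints the mean positions collapse onto the interpolated reference coordinates and the diagonal of `g` (the mean-square fluctuation of each residue in units of $a^2$) becomes small; weakening some entries of `c` lets the corresponding residues fluctuate and relaxes their mean positions toward those favoured by chain connectivity.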

In this model, the probability for a particular configurational ensemble at temperature $T$
is determined by the variational free energy $F(\{C\},\{\alpha\}) = E(\{C\},\{\alpha\}) - T S(\{C\},\{\alpha\})$. Here,
$S(\{C\},\{\alpha\})$ is the entropy loss due to localizing the residues around the mean positions,

$$S(\{C\},\{\alpha\})/k_B = \frac{3}{2}\log\det G - \frac{3}{2a^2}\sum_{ij} \mathbf{s}_i \cdot \Gamma_{ij} \cdot \mathbf{s}_j + \frac{3}{2}\sum_i C_i G_{ii}. \qquad (4)$$

The energy is derived from two-body interactions between native contacts, $E(\{C\},\{\alpha\}) =
\sum_{[ij]} \epsilon_{ij} u_{ij}$, where $u_{ij}$ is the average of the pair potential $u(r_{ij})$ over $H_0$, and $\epsilon_{ij}$ is the strength
of a fully formed contact between residues $i$ and $j$ given by Miyazawa-Jernigan interaction
parameters. The sum is restricted to a set of contacts determined by pairs of residues in
proximity in each of the meta-stable conformations. The pair potential between two
monomers is developed as a sum of three Gaussians,

$$u(r) = \gamma_s e^{-3\beta_s r^2/2a^2} + \gamma_i e^{-3\beta_i r^2/2a^2} - \gamma_l e^{-3\beta_l r^2/2a^2}.$$

The parameters are chosen so that $u(r)$ has a minimum at $r^* = 1.6a$ with
value $u_{ij}(r^*) = -1$ formed by the long-range attractive interaction ($\gamma_l = 6.0$, $\beta_l = 0.27$)
and the intermediate-range repulsive interaction ($\gamma_i = 9.0$, $\beta_i = 0.54$) as in Ref. 17. Excluded
volume interactions are represented by a short-range repulsive potential with $\beta_s = 3.0$, and
$\gamma_s$ is chosen so that each contact has $u_{ij}(0)/\epsilon_0 = 100$, where $\epsilon_0$ is the basic energy unit of
the Miyazawa-Jernigan scaled contacts. The energy of a contact between residues $i$ and $j$
in a partially ordered chain is given by

$$\epsilon_{ij} u_{ij} = \epsilon_{ij}\langle u(r_{ij})\rangle_0 = \epsilon_{ij}\sum_{k=(s,i,l)} \gamma_k \left(1 + \beta_k\,\delta G_{ij}\right)^{-3/2} \exp\left[-\frac{3}{2a^2}\,\frac{\beta_k(\mathbf{s}_i - \mathbf{s}_j)^2}{1 + \beta_k\,\delta G_{ij}}\right], \qquad (5)$$

where $\delta G_{ij} = G_{ii} + G_{jj} - 2G_{ij}$ and the long-range term enters with a negative sign, as in $u(r)$.
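
The three-Gaussian pair potential and its average over the Gaussian reference ensemble can be checked numerically. The sketch below is illustrative only: distances are measured in units of $a$, the function names are hypothetical, and the short-range amplitude is fixed by assuming $\epsilon_{ij} = \epsilon_0$ so that $u(0) = 100$, which is a simplification of the per-contact condition quoted in the text.

```python
import math

GAMMA_L, BETA_L = 6.0, 0.27           # long-range attractive term
GAMMA_I, BETA_I = 9.0, 0.54           # intermediate-range repulsive term
BETA_S = 3.0                          # short-range repulsive width
GAMMA_S = 100.0 - GAMMA_I + GAMMA_L   # assumption: eps_ij = eps_0, so u(0) = 100

# (signed amplitude, width) of each Gaussian; the attractive term is negative.
TERMS = [(+GAMMA_S, BETA_S), (+GAMMA_I, BETA_I), (-GAMMA_L, BETA_L)]

def pair_potential(r):
    """u(r) for a separation r given in units of the bond length a."""
    return sum(g * math.exp(-1.5 * b * r * r) for g, b in TERMS)

def averaged_pair_potential(mean_sep, delta_g):
    """Average of u over Gaussian fluctuations.

    mean_sep is |s_i - s_j| / a and delta_g is G_ii + G_jj - 2 G_ij.
    """
    total = 0.0
    for g, b in TERMS:
        width = 1.0 + b * delta_g
        total += g * width ** -1.5 * math.exp(-1.5 * b * mean_sep ** 2 / width)
    return total

# Locate the minimum of u(r) on a fine grid between 0.5a and 3.0a.
r_star = min((0.5 + 0.001 * k for k in range(2500)), key=pair_potential)
u_star = pair_potential(r_star)
```

On this grid the minimum lies close to the quoted $r^* = 1.6a$ with a depth close to $-1$. Setting `delta_g` to zero in `averaged_pair_potential` recovers the bare potential, while increasing it smears out the well, which is how fluctuating residues lose contact energy in the model.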

In this work, we consider a two-state model in which the contacts are separated into three
sets: (i) contacts that occur in reference structure (1) only, (ii) contacts that occur in reference
structure (2) only, and (iii) contacts in common to both reference structures. Then,
we consider that each contact involved exclusively with only one structure is in equilibrium
with energy from the other state (which is zero). That is, we replace the pair energy for
contacts in sets (i) and (ii) according to

$$\epsilon_{ij} u_{ij} = -k_B T \log\left[1 + \exp\left(-\epsilon_{ij}\langle u(r_{ij})\rangle_0 / k_B T\right)\right]. \qquad (6)$$

This form is analogous to coupling between conformational basins in folding-inspired molecular
dynamics simulation. Contacts described by Eq. 6 independently switch on or off
depending on the conformational density characterized by a set of constraints $\{C, \alpha\}$.
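
The switching behaviour of Eq. (6) is easy to see numerically. The following is a minimal sketch with a hypothetical function name; it evaluates the logarithm in an overflow-safe form, which is a choice made for this illustration and is not prescribed by the text.

```python
import math

def two_state_contact_energy(eps_u, kT):
    """Evaluate -kT * log(1 + exp(-eps_u / kT)) without overflow.

    eps_u is the averaged contact energy eps_ij * <u(r_ij)>_0.
    """
    x = -eps_u / kT
    # log(1 + e^x) = max(x, 0) + log(1 + e^(-|x|)) holds for every real x.
    return -kT * (max(x, 0.0) + math.log1p(math.exp(-abs(x))))

strong = two_state_contact_energy(-50.0, 2.0)   # well-formed, stabilizing contact
broken = two_state_contact_energy(+50.0, 2.0)   # contact that would be unfavorable
neutral = two_state_contact_energy(0.0, 2.0)    # borderline case
```

A strongly stabilizing contact returns essentially its full energy, an unfavorable one contributes almost nothing (the contact is "off"), and the borderline case gives $-k_B T \log 2$, reflecting the two equally weighted states.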

Analysis of the free energy surface parameterized by $\{C, \alpha\}$ follows the program developed
to describe folding: the ensemble of structures controlling the transition is characterized by
the monomer density at the saddlepoints of the free energy. At this point, we simplify our
model and restrict the interpolation parameter $\alpha_i$ to be the same for all residues, $\alpha_i = \alpha_0$,
following Kim et al. Then, the numerical problem simplifies to minimizing the free energy
with respect to $\{C\}$ rather than finding saddlepoints in $\{C, \alpha\}$.

To explore the nature of conformational dynamics in detail, we apply this model to the
N-terminal domain of CaM (nCaM). In particular, we use residues numbered 4-75 of unbound
nCaM (apo, 1CFD) and bound nCaM (holo, 1CLL) (see Fig. 1). In our model, we have defined
closed nCaM (1CFD) as structure (1) and open nCaM (1CLL) as structure (2). Thus, the
interpolation parameter $\alpha_0 = 1$ corresponds to the closed state, and $\alpha_0 = 0$ corresponds to
the open state. The coordinates of the open and closed structures were rotated to minimize the
rmsd of α-carbons between the two structures. We note that global alignment carries the risk of
obscuring or averaging out some local structural differences. The temperature $T$
for the open/closed transition is taken to be the folding temperature ($T_f$) of the open (holo,
1CLL) structure with $k_B T_f = 2.0$. For comparison, the folding temperature for the closed (apo,
1CFD) structure is $k_B T_f = 1.9$.
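
The rotation that minimizes the rmsd between two sets of α-carbon coordinates can be obtained in several ways; the text does not say which procedure was used. One standard choice is the Kabsch algorithm, sketched below with hypothetical function names and random toy coordinates in place of the two nCaM structures.

```python
import numpy as np

def kabsch_align(mobile, target):
    """Rotate and translate `mobile` (N x 3) onto `target` (N x 3) to minimise the rmsd."""
    mob_c = mobile - mobile.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    h = mob_c.T @ tgt_c                       # 3 x 3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))    # avoid an improper rotation (reflection)
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return (rot @ mob_c.T).T + target.mean(axis=0)

def rmsd(a, b):
    """Root-mean-square deviation between two N x 3 coordinate sets."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

# Toy check: a rigidly rotated and shifted copy should align back exactly.
rng = np.random.default_rng(1)
pts = rng.normal(size=(10, 3))
theta = 0.7
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = (rz @ pts.T).T + np.array([1.0, -2.0, 0.5])
aligned = kabsch_align(pts, moved)
```

Because the superposition is global, regions that move as rigid bodies relative to the rest of the domain still contribute to the residual, which is the caveat about local structural differences noted above.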
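For a given set of constraints, $\{C, \alpha\}$, the monomer density of a partially ordered ensemble
can be characterized by the Gaussian measure of similarity to the conformation described by $\{\mathbf{r}_i^{N_1}\}$,

$$\rho_i^{(1)}[\{C,\alpha\}] = \left\langle \exp\left[-\frac{3\alpha^N}{2a^2}\left(\mathbf{r}_i - \mathbf{r}_i^{N_1}\right)^2\right]\right\rangle_0 = \left(1 + \alpha^N G_{ii}\right)^{-3/2} \exp\left[-\frac{3\alpha^N}{2a^2}\,\frac{\left(\mathbf{s}_i - \mathbf{r}_i^{N_1}\right)^2}{1 + \alpha^N G_{ii}}\right], \qquad (7)$$

where $\alpha^N$ sets the width of the Gaussian measure. Similarly, the structural similarity to the conformation described by $\{\mathbf{r}_i^{N_2}\}$ is defined as

$$\rho_i^{(2)}[\{C,\alpha\}] = \left(1 + \alpha^N G_{ii}\right)^{-3/2} \exp\left[-\frac{3\alpha^N}{2a^2}\,\frac{\left(\mathbf{s}_i - \mathbf{r}_i^{N_2}\right)^2}{1 + \alpha^N G_{ii}}\right]. \qquad (8)$$

The structural similarities relative to the native structures given by $\{\rho_i^{(1)}\}$ and $\{\rho_i^{(2)}\}$ specify
local order parameters suitable for describing conformational transitions between metastable
states in proteins.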

To investigate the detailed main-chain dynamics controlling the structural change in
CaM, we characterize the relative similarity to the closed structure along the transition
route through the normalized measure

$$\bar{\rho}_i^{(1)}(\alpha_0) = \frac{\rho_i^{(1)}(\alpha_0) - \rho_i^{(1)}(0)}{\rho_i^{(1)}(1) - \rho_i^{(1)}(0)}, \qquad (9)$$

where $\rho_i^{(1)}(\alpha_0)$ is the monomer density of the $i$th residue with respect to the closed
conformation (Eq. 7). Similarly, we represent the relative structural similarity to the open
conformation as

$$\bar{\rho}_i^{(2)}(\alpha_0) = \frac{\rho_i^{(2)}(\alpha_0) - \rho_i^{(2)}(1)}{\rho_i^{(2)}(0) - \rho_i^{(2)}(1)}, \qquad (10)$$

where $\rho_i^{(2)}(\alpha_0)$ is the monomer density of the $i$th residue with respect to the open
conformation (Eq. 8). In the open state, $\bar{\rho}_i^{(1)}(0) = 0$ and $\bar{\rho}_i^{(2)}(0) = 1$, while in the closed
state $\bar{\rho}_i^{(1)}(1) = 1$ and $\bar{\rho}_i^{(2)}(1) = 0$. To represent the structural changes more clearly, it is
convenient to consider the difference

$$\Delta\bar{\rho}_i(\alpha_0) = \bar{\rho}_i^{(1)}(\alpha_0) - \bar{\rho}_i^{(2)}(\alpha_0) \qquad (11)$$

for each residue. This difference shifts the relative degree of localization to lie between
$\Delta\bar{\rho}_i(1) = 1$ and $\Delta\bar{\rho}_i(0) = -1$, corresponding to the closed and open conformations,
respectively.
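
The normalized similarity measures of Eqs. (7)-(11) can be illustrated for a single residue. The sketch below is a toy example under loudly stated assumptions: the function names are hypothetical, the width parameter `alpha_n = 0.5` and the fluctuation `g_ii = 0.1` are arbitrary illustrative values, and the mean position is simply moved in a straight line between an assumed "closed" and "open" position rather than obtained by minimizing the free energy.

```python
import math

def gaussian_similarity(s_i, g_ii, r_ref, alpha_n=0.5, a=1.0):
    """Gaussian similarity of one residue to a reference position (closed form of Eq. 7 or 8)."""
    width = 1.0 + alpha_n * g_ii
    dist2 = sum((x - y) ** 2 for x, y in zip(s_i, r_ref))
    return width ** -1.5 * math.exp(-1.5 * alpha_n * dist2 / (a * a * width))

def normalized_difference(rho1, rho2):
    """Difference of normalized similarities on an alpha_0 grid.

    rho1[k], rho2[k] are similarities to the closed and open structures;
    the grid runs from alpha_0 = 0 (open, first entry) to 1 (closed, last entry).
    """
    r1_open, r1_closed = rho1[0], rho1[-1]
    r2_open, r2_closed = rho2[0], rho2[-1]
    out = []
    for p1, p2 in zip(rho1, rho2):
        bar1 = (p1 - r1_open) / (r1_closed - r1_open)
        bar2 = (p2 - r2_closed) / (r2_open - r2_closed)
        out.append(bar1 - bar2)
    return out

# Toy residue: closed position at the origin, open position two bond lengths away.
closed_pos, open_pos = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
grid = [k / 10.0 for k in range(11)]                       # alpha_0 from 0 to 1
means = [tuple(a0 * c + (1.0 - a0) * o for c, o in zip(closed_pos, open_pos))
         for a0 in grid]
rho_closed = [gaussian_similarity(s, 0.1, closed_pos) for s in means]
rho_open = [gaussian_similarity(s, 0.1, open_pos) for s in means]
delta = normalized_difference(rho_closed, rho_open)
```

For this toy residue `delta` rises monotonically from $-1$ at the open end of the grid to $+1$ at the closed end. In the full model the shape of this curve differs from residue to residue, and that variation is what distinguishes early from late and gradual from cooperative structural changes.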



RESULTS 



Conformational Flexibility and Calcium Binding 

The local mean square fluctuations of α-carbon positions (related to the temperature
factors from X-ray crystallography) are a natural set of order parameters for the reference
Hamiltonian $H_0$ in our model. This parameter, $B_i = \langle \delta\mathbf{r}_i^2 \rangle_0$, contains information about the
degree of structural order and conformational flexibility of each residue. In Fig. 2 we have
plotted $B_i$ versus sequence number at different values of $\alpha_0$, the parameter that controls
the uniform interpolation between the open structure ($\alpha_0 = 0$) and the closed structure
($\alpha_0 = 1$). Fig. 3 shows the corresponding 3D structures of the nCaM domain with the residues
colored according to $B_i$. Aside from the very flexible ends of the two terminal helices A and D,
the Ca²⁺-binding loops and the helix-linker possess the highest flexibility. The calculated
fluctuations from our model exhibit very good qualitative agreement with X-ray temperature
factors and simulation results for CaM.

Binding loops. Each EF-hand in CaM coordinates Ca²⁺ through a 12-residue loop:
Asp20-Glu31 in loop I and Asp56-Glu67 in loop II. The C-terminal ends of the loops contain
a short β-sheet (residues 26-28 in loop I and residues 62-64 in loop II) adjacent to the last three
residues that are part of the exiting helices B and D, respectively.

As shown in Fig. 2, the loops remain relatively flexible even in the open conformation.
The highest flexibility is near the two Glycines in position 4 of the Ca²⁺-binding loops I
(Gly23) and II (Gly59). This invariable Gly residue provides a sharp turn required for the
proper geometry of the Ca²⁺-binding sites. The linker between helices B and C is also
very mobile, with the highest flexibility near residue Glu45. Taken together, the mobility
of the loops and B/C linker indicates that the domain opening depends entirely on a set of
inherent dynamics, or "intrinsic plasticity", of CaM.

A closer look at the fluctuations of the Ca²⁺-binding loops reveals that the N-terminal
part of each loop is more flexible than the C-terminal part. This agrees with NMR data
characterizing the flexibility of the N-terminal and C-terminal parts of loops III and IV of
the C-terminal domain. In the transition route (from closed → open), the N-terminal
ends of the loops stiffen gradually. On the other hand, in the C-terminal part of the loops
the short β-sheet structure (residues 26-28 in loop I and 62-64 in loop II) remains rigid (see
Figs. 2 and 3). Also the last three residues of the loops (residues 29-31 in loop I/helix B and
residues 65-67 in loop II/helix D) remain relatively rigid, stabilized by the exiting helices B
and D respectively.

This immobile region, the EF-hand β-scaffold, is central to a recently proposed mechanism
for CaM and other EF-hand domains. Fig. 2 shows that residues Thr26 and Ile27 (in the
β-sheet of loop I) and Thr62 and Ile63 (in the β-sheet of loop II) remain very rigid during the
domain opening.

It is also interesting to compare the relative flexibility of binding loops I and II. It is clear
that binding loop II is more flexible than loop I in both conformations (see Fig. 2 and
Fig. 3(a)). In particular, the connection between helix A and binding loop I is much more
rigid than the connection between helix C and binding loop II. This large difference
in flexibility suggests that binding loop II of nCaM is more dominant in the mechanism
for the structural transition. A similar mechanism in the C-terminal CaM domain was also
observed in NMR studies, where the Ca²⁺-dependent exchange contribution is dominated
by binding loop IV, which has lower $S^2$ (higher flexibility) than loop III.

Helices B and C and the B/C linker. Figs. 2 and 3 also show that the bottom
part of helix C (close to the B/C helix-linker) is very flexible in apo nCaM. Upon opening, the
flexibility of helix C decreases significantly. [See the change in color from blue to white
(Fig. 3(a)-(c)) at the bottom part (close to the B/C helix-linker) of helix C and from white to
red at the middle part of helix C.] In contrast, the top part of helix B (close to binding
loop I; residues 29-31) becomes more flexible than the bottom part of helix B (close to the B/C
helix-linker; residues 32-37) during the closed to open transition (see Fig. 2). We also note
that residues 37-42 of the B/C helix-linker show a significant increase in flexibility during
opening of the domain. This change in flexibility of the B/C helix-linker helps facilitate the
concerted reorientation of helices B and C during the closed → open transition. Similar
behavior was also observed in molecular dynamics simulation of CaM for this six-residue
(residues 37-42) segment.

Conformational Change and Transition Mechanism 

The results discussed in the previous section give a picture of the closed to open transition
with good overall agreement with experiment and simulation results on an isolated apo-CaM
domain. Nevertheless, the analysis has focused primarily on the difference in the
magnitude of fluctuations of the two meta-stable states. We now turn our attention to 
the predicted transition mechanism and qualitative nature of structural changes along the 
transition route. Such a description includes: along the transition route from closed to 
open, what structural changes are predicted to occur early/late, and which are predicted 
to happen gradually/cooperatively. While such details have yet to be revealed directly 
through measurement, in principle, site-directed mutagenesis experiments can be used to 
identify kinetically important structural regions of nCaM. 

To clarify the transition route, we introduce a structural order parameter that measures
the similarity to the open or closed state, $\Delta\bar{\rho}_i$, given in Eq. 11. This order parameter is
defined so that $\Delta\bar{\rho}_i = 1$ corresponds to the closed conformation and $\Delta\bar{\rho}_i = -1$ corresponds
to the open conformation of the nCaM domain. Fig. 4 illustrates the conformational transition
in the nCaM domain in terms of $\Delta\bar{\rho}_i$ for each residue. An alternative representation of the
same data is shown in Fig. 5; here, the value of $\Delta\bar{\rho}_i$ is represented as colors ranging from
red ($\Delta\bar{\rho}_i = -1$) to white ($\Delta\bar{\rho}_i = 0$) to blue ($\Delta\bar{\rho}_i = 1$) superimposed on the interpolated
structure for selected values of $\alpha_0$.

We first notice an early transition in the binding loops and in the central region
of helix C, evident in Fig. 4. [See also the gradual change in color from blue to red in
the structures of Fig. 5(a)-(d).] We also note the concerted structural change of parts of
helices B and C and the flexible B/C helix-linker (residues 31-49). In particular, the flexible
B/C helix-linker (residues 38-44) in Fig. 4 exhibits a cooperative transition. Residue Gln41,
which is located in this linker region, is highly mobile according to NMR data. The change
in color from red to blue in the B/C helix-linker in Fig. 5(a) and (b) indicates that the
structural transition of the N-terminal part (close to helix B) of this linker occurs earlier than
that of its C-terminal part (close to helix C).

Fig. 4 and Fig. 5 also show a delayed initiation of structural change in residues 4-7 of
helix A, residues 27-30 of binding loop I and the N-terminal part of helix B. Specifically, the
residues near the top part of helix B (close to binding loop I) and in binding loop I have
very little structural change at the beginning of domain opening, with a sharp, cooperative
transition near the end. [See the relatively slow color change (from red to blue) in this part
of helix B and binding loop I in Fig. 5(a)-(d).] Although the middle part of helix C (residues
50-52) has some limited structural change early in the transition, it remains quite immobile
after that. [See Fig. 4 and the early color change from red to blue in Fig. 5.]

Binding loops I and II. Because of the central importance of the interactions between
the binding loops in the recently proposed two-step Ca²⁺-binding mechanism, this
EFβ-scaffold region is highlighted in Fig. 6. In the first step of this binding mechanism, the
Ca²⁺ ion is immobilized by the structural rigidity in the plane of the β-sheet and the ligands from
the N-terminal part of the binding loops. In the second step, the backbone torsional flexibility of
the EFβ-scaffold enables repositioning of the C-terminal part of the binding loop together
with the exiting helix (helix B in loop I and helix D in loop II). Since the Ca²⁺ ions
are not included in our model and we cannot characterize the backbone torsional flexibility of
the EFβ-scaffold, our analysis is independent of the one developed in that work. The closed
to open conformational transition of each binding loop is quite different in Fig. 6. We
predict that the structural changes in binding loop II occur before those in binding loop I upon
domain opening (see the slower color change from red to blue in binding loop I
than in loop II in Fig. 6). Since the flexibility of binding loop II is also greater, this suggests
that during the Ca²⁺-binding process loop II dominates the overall conformational
change between the closed and open states. This agrees with results based on the all-atom
molecular dynamics simulations of nCaM discussed by Vigil et al.

Fig. 6 also shows that the N-terminal ends of the loops have a relatively early transition
compared to the C-terminal ends. Furthermore, the conformational change of the C-terminal
end of binding loop I is more cooperative, presumably relying on the earlier structural change
in binding loop II. Specifically, the closed-state structure of the residue in position 9 (Thr28) of
loop I is very stable, as shown in Fig. 7(a). This is due to a hydrogen bond between Thr28
and Glu31. Fig. 7(a) also suggests that the structural change of Glu31 occurs before that of Thr28
upon domain opening, and proceeds through the transition much more gradually. Similar
hydrogen bonding is also present between Asn64 and Glu67 in binding loop II. Nevertheless,
compared to the corresponding residues in loop I, the structural change of these two residues
is quite gradual [see Fig. 7(a)]. Even so, Asn64 does seem to have a somewhat sharper
transition than Glu67. Finally, residues Gly61 and Thr62 in binding loop II exhibit little
structural change in Fig. 6 as the domain begins to open.

Methionine residues. The large hydrophobic binding surfaces that open in both domains
of CaM are especially rich in Methionine residues, with four Methionines in each
domain occupying nearly 46% of the total hydrophobic surface area. These side chains, as
well as other aliphatic residues such as Valine, Isoleucine and Leucine, which make up the
rest of the hydrophobic binding surface, are highly dynamic in solution. The flexibility
of the residues composing the hydrophobic binding surface for target peptides explains CaM's
high degree of binding promiscuity. Here we consider the main-chain flexibility. The four
Methionine residues in nCaM are situated at positions 36, 51, 71 and 72. The closed to open
structural transitions of residues Met36 and Met71 are similar and relatively sharp compared
to that of residue Met72, which is quite gradual, as shown in Fig. 7(b). This suggests that residues
Met36 and Met71 remain relatively buried at the beginning of the domain opening. Curiously,
from Fig. 7(b), residue Met51 in the middle part of helix C shows a sudden
increase in $\Delta\bar{\rho}_i$ at $\alpha_0 = 0.5$ during the closed to open conformational change.

Conformational Transition Rate and Order Parameter 

The one-dimensional free energy profile parameterized by the interpolation parameter
$\alpha_0$ is shown in Fig. 8. The minimum corresponding to the open state is very shallow and
unstable compared to the closed state. Combined molecular dynamics simulations and
small angle X-ray scattering studies on apo nCaM and Ca²⁺-bound nCaM by Vigil et al.
have also shown that in aqueous solution the closed state dominates the population. The
equilibrium populations for the closed and open states from our model are found to be 94%
and 6% respectively. For comparison, NMR measurements of apo cCaM indicate a minor
population of 5-10%. These results suggest that on average, the residues in the hydrophobic
surface of CaM are well protected from solvent.
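
To put the quoted populations on an energy scale, one can treat the closed and open basins as a simple two-state Boltzmann system. This is only a back-of-the-envelope sketch with hypothetical function names; the populations in the model come from the full free energy profile, not from this two-level reduction.

```python
import math

def free_energy_gap(p_major, p_minor):
    """(F_minor - F_major) / kT implied by two-state Boltzmann populations."""
    return math.log(p_major / p_minor)

def minor_population(gap):
    """Population of the higher free energy state for a gap given in units of kT."""
    return 1.0 / (1.0 + math.exp(gap))

gap_model = free_energy_gap(0.94, 0.06)   # closed vs open populations quoted for the model
gap_low = free_energy_gap(0.95, 0.05)     # 5% minor population (NMR range, lower end)
gap_high = free_energy_gap(0.90, 0.10)    # 10% minor population (NMR range, upper end)
```

Under this two-state reading, the 94%/6% split corresponds to the open basin lying roughly 2.75 $k_B T$ above the closed basin, which falls inside the range implied by a 5-10% minor population.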

The maximum of the free energy occurs quite close to the open state at $\alpha_0 \approx 0.2$,
though the barrier is very broad in terms of this reaction coordinate. We also consider
the free energy as a function of the global structural parameter $\Delta Q = Q_1 - Q_2 = \sum_i \Delta\bar{\rho}_i / N$, where $\Delta\bar{\rho}_i$
is given in Eq. 11. Fig. 8 shows that $\Delta Q$ is also a reasonable reaction coordinate for
the transition. The barrier broadens somewhat, with the maximum free energy occurring
around $\Delta Q = -0.25$. In terms of the global structure, this roughly corresponds to 60%-75%
of nCaM being similar to the open state configuration in the transition state ensemble.

Even though the open state minimum is not well isolated, we estimate the conformational
transition rate from closed to open using the Arrhenius form, $k = k_0 e^{-\Delta F^{\ddagger}/k_B T}$, where $\Delta F^{\ddagger}$
is the free energy difference between the closed conformation and the transition-state ensemble.
Assuming the prefactor $k_0 = 1\,\mu\mathrm{s}^{-1}$ gives the estimate $k = 40{,}000\ \mathrm{s}^{-1}$. This value is in
reasonable agreement with the transition rate estimate of $k = 20{,}000\ \mathrm{s}^{-1}$ based on NMR
exchange rate data of cCaM.
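
The Arrhenius estimate can be inverted to see what barrier the quoted numbers imply. The sketch below is plain arithmetic on the values given in the text, with hypothetical function names; the barrier itself is not quoted in the text and is inferred here from the prefactor and the rate.

```python
import math

def arrhenius_rate(barrier_kT, k0):
    """Rate k = k0 * exp(-barrier), with the barrier in units of kT."""
    return k0 * math.exp(-barrier_kT)

def implied_barrier(rate, k0):
    """Barrier in units of kT implied by a rate and a prefactor."""
    return math.log(k0 / rate)

K0 = 1.0e6                                    # prefactor of 1 per microsecond, in 1/s
barrier_model = implied_barrier(4.0e4, K0)    # from the model estimate of 40,000 per second
barrier_nmr = implied_barrier(2.0e4, K0)      # from the NMR-based estimate of 20,000 per second
rate_check = arrhenius_rate(barrier_model, K0)
```

With the same prefactor, the factor of two between the two rates corresponds to a difference of only $\log 2 \approx 0.7\,k_B T$ in the effective barrier, which gives a sense of what "reasonable agreement" means on the free energy scale. The comparison is sensitive to the assumed prefactor.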

DISCUSSION 

The primary motivation for the work presented in this paper is to understand protein
functions that involve large-scale (main-chain) dynamics and flexibility. Proteins with relatively
large conformational freedom include those in which folding and binding are coupled, as
well as those with hinge bending motions, proteins with high plasticity such as ion binding sites,
and proteins with allosteric transitions. While not nearly as developed as the Energy
Landscape Theory of protein folding, a general thermodynamic framework for the Energy
Landscape Theory of protein-protein binding, large conformational transitions, and
the coupling between folding and binding is beginning to emerge. Aside from some noted
exceptions, relatively little theoretical work has focused on detailed analysis of
transition mechanisms of flexible proteins in terms of specific ensembles of kinetic pathways.
The dynamics of conformational transitions between well-defined conformational basins are
generally controlled by relatively low-probability, partially ordered ensembles. The main
challenge is to describe the transition state ensembles at the residue level, giving a site-specific
description of the transition mechanism.

Modern NMR relaxation experiments have provided a wealth of quantitative data about internal
dynamics and conformational sub-states on fast (nanosecond) and slow (micro-
to millisecond) timescales. Such studies are very useful in identifying residues with high
flexibility upon target binding, not only through movements of surface loops and side chains,
but also by global motions of the core structure. These experiments, however, report on only
a few local structural changes and have not been able to capture the molecular details
necessary to fully understand the mechanism of conformational transitions. Whereas atomistic
simulations can potentially bridge the gap on timescales up to microseconds, this timescale
falls orders of magnitude short for slow protein dynamics (millisecond to second). Also, the
use of atomistic approaches becomes computationally inefficient with the increased size of a
system.

To overcome the problems associated with all-atom simulations, many studies have demonstrated
the use of coarse-grained protein models with simplified representations (for example, only
α-carbons as point masses) and simplified energy functions. Such models require much less
computation, making them practical for describing the conformational transitions of even
large proteins. Analyzing the fluctuations about a single minimum has been surprisingly
successful in identifying relevant cooperative motions in a wide range of proteins. The
commonly used Tirion potential (which can be viewed as a harmonic Go-model) gives a simple
one-parameter model in which the relevant motion for the transition is identified with one of
the many low-frequency normal modes. While this approach can provide considerable insight, it
offers a limited description of the transition because it is based only on the fluctuations
about one structure. The Tirion potential has recently been extended to include two
conformations: the contact map defining the potential and normal modes is updated as the
protein is moved along a known reaction coordinate. Local unfolding and flexibility are
accommodated by relieving regions of high stress ("cracking"), which modifies the contact
map. Coarse-grained simulations in which the potential interpolates between two folded-state
biased contact maps have also been introduced recently. For example, in the plastic network
model of Maragakis and Karplus the individual basins are approximated by the Tirion potential
and are then smoothly connected by a secular equation formulation. A similar interpolation
was considered by Okazaki et al. Alternatively, Best et al. developed a two-state
approximation analogous to Eq. [6]. These advances are similar in spirit to our approach,
albeit with distinct approximations for the basic description of partially ordered
ensembles.
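
As a rough illustration of the single-basin analysis mentioned above (and not of the
variational model developed in this paper), the following sketch builds a Tirion-style
harmonic network on α-carbon coordinates and extracts its normal modes. Everything specific
here is an assumption made for illustration: the idealized helical trace standing in for a
real structure, the 10 Å cutoff, the unit spring constant, and the function names
`anm_hessian`, `normal_modes` and `toy_helix`.

```python
import numpy as np

def anm_hessian(coords, cutoff=10.0, gamma=1.0):
    """Hessian of a harmonic network: every pair of sites closer than
    `cutoff` is joined by a spring of strength `gamma` (assumed values)."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue  # pair is not in contact
            block = -gamma * np.outer(d, d) / r2
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian

def normal_modes(hessian, tol=1e-8):
    """Eigenvalues and eigenvectors of the internal modes, softest first.
    The six rigid-body modes (zero eigenvalue) are discarded."""
    evals, evecs = np.linalg.eigh(hessian)
    keep = evals > tol
    return evals[keep], evecs[:, keep]

def toy_helix(n=30, radius=2.3, rise=1.5, turn=np.deg2rad(100.0)):
    """Idealized helical trace; neighbouring sites are about 3.8 A apart."""
    k = np.arange(n)
    return np.column_stack((radius * np.cos(k * turn),
                            radius * np.sin(k * turn),
                            rise * k))

coords = toy_helix()
hessian = anm_hessian(coords)
evals, modes = normal_modes(hessian)

# Mean-square fluctuation of each site, in units of k_B T / gamma:
# sum over modes of (squared amplitude on that site) / eigenvalue.
msf = (modes ** 2 / evals).reshape(len(coords), 3, -1).sum(axis=(1, 2))
```

The lowest-eigenvalue columns of `modes` are the soft collective motions that a
single-minimum analysis would propose as candidate transition directions. Because the
network is defined by one structure only, it carries no information about the second basin,
which is the limitation noted in the text.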



CONCLUSION 



In this paper, we study the intrinsic flexibility and structural change in the N-terminal
domain of CaM (nCaM) during the open to closed transition. The predicted transition route
from our model gives a detailed picture of the interplay between structural transition,
conformational flexibility and function of the nCaM domain. The results from our model are
largely consistent with the important role that the immobile EFβ-scaffold region plays in
the transition mechanism. Dissection of the transition route of this region further suggests
that it is the early structural change of loop II that drives the cooperative completion of
the interactions between the loops in the open structure.

The strong qualitative agreement with available experimental measurements of flexibility
is an encouraging validation of the model. Recently, the folding dynamics of a zinc-metallated
protein (azurin) was studied using a similar variational model and compared with experiments
on the detailed coordination reaction coupled with the entatic state. A similar future study
of the detailed coordination reaction, giving a complete description of the conformational
change stabilized by ion binding in CaM, seems very promising. Ultimately, we wish to extend
this model to investigate the binding mechanism and kinetic paths of several peptides to
Ca2+-loaded CaM. Since large conformational changes coupled to binding depend fundamentally
on the fluctuations of partially folded conformations, this polymer-based variational
formalism can accommodate coupled folding and binding very naturally.

Figure Legends

Figure [1]

The N-terminal domain of calmodulin (nCaM). (a) The Ca2+-free (apo, closed) structure,
PDB code 1cfd. (b) The Ca2+-bound (holo, open) structure, PDB code 1cll. (c) The
secondary structure of nCaM is shown with the one-letter amino acid sequence code for
residues 4-75. The secondary structure of nCaM is as follows: helix A (5-19), Ca2+-binding
loop I (20-31), helix B (29-37), B/C helix-linker (38-44), helix C (45-55), Ca2+-binding
loop II (56-67), helix D (65-75). Note that the last three residues of binding loops I and II
are also part of the exiting helices B and D. There are short β-sheet structures in binding
loop I (residues 26-28) and loop II (residues 62-64). This and other three-dimensional
illustrations were made using Visual Molecular Dynamics (VMD).

Figure [2]

Fluctuations $B_i = \langle \delta r_i^2 \rangle_0 = G_{ii} a^2$ vs sequence index of nCaM
for selected values of the interpolation parameter $\alpha_0$ in the conformational
transition route between open and closed. Here $a = 3.8$ Å is the distance between successive
monomers. Different $\alpha_0$ are denoted by: red ($\alpha_0 = 0$) open; green
($\alpha_0 = 0.2$); blue ($\alpha_0 = 0.4$); pink ($\alpha_0 = 0.6$); orange
($\alpha_0 = 0.8$) and black ($\alpha_0 = 1$) closed. The secondary structure is indicated
below the plot.

Figure [3]

Change in fluctuations in the nCaM domain during the closed to open conformational
transition. The 3D structure in (a) corresponds to the interpolation parameter
$\alpha_0 = 1$ (closed state); (b) corresponds to $\alpha_0 = 0.4$ (intermediate state) and
(c) corresponds to $\alpha_0 = 0$ (open state). Red corresponds to low fluctuations and blue
corresponds to high. Here, $a$ is the distance between successive monomers.

Figure [4]



Difference between the normalized native density $\Delta\rho_i$ (a measure of structural
similarity) of each residue for different $\alpha_0$. The change in color from red to blue
shows the closed to open conformational transition of nCaM. $\Delta\rho_i$ is normalized to
be $-1$ at the open state minimum ($\alpha_0 = 0$; blue) and $1$ at the closed state minimum
($\alpha_0 = 1$; red). The secondary structure of nCaM is shown below the plot.
Here, in Eq. [7] and Eq. [8] is 0.5.

Figure [5]

Closed to open conformational transition in nCaM for different values of the interpolation
parameter $\alpha_0$. The 3D structure in (a) corresponds to $\alpha_0 = 0.8$; (b)
corresponds to $\alpha_0 = 0.6$; (c) corresponds to $\alpha_0 = 0.4$ and (d) corresponds to
$\alpha_0 = 0.2$. The change in color from red to blue corresponds to different values of the
normalized native density $\Delta\rho_i$ (a measure of structural similarity) of each residue
for different $\alpha_0$. Red corresponds to $\Delta\rho_i = 1$ (closed conformation) and
blue corresponds to $\Delta\rho_i = -1$ (open conformation).

Figure [6]

Comparison of structural change in binding loops I (bottom) and II (top) in terms of the
order parameter $\Delta\rho_i$. The 3D structures in (a)-(i) correspond to the interpolation
parameter $\alpha_0 = 0.9$ to $0.1$ during the closed to open transition. The change in color
from red to blue corresponds to different values of $\Delta\rho_i$ (a measure of structural
similarity) of each residue. Red corresponds to $\Delta\rho_i = 1$ (closed conformation) and
blue corresponds to $\Delta\rho_i = -1$ (open conformation).

Figure [7]

Dynamical behavior of residues during the conformational transition of nCaM. The normalized
native density difference $\Delta\rho_i$ vs $\alpha_0$ is shown for four different groups of
residues. Structural transition of (a) residues in position 9 (Thr28 and Asn64) and
position 12 (Glu31 and Glu67) of the two binding loops; (b) four hydrophobic methionine
residues in positions 36, 51, 71 and 72.

Figure [8]



Free energy along the transition route. In the lower curve the abscissa is the interpolation
parameter $\alpha_0$. In the upper curve the abscissa is the global structural order
parameter $\Delta Q$. The entropy across the transition is relatively constant, so that the
free energy barrier is largely energetic.